We study numerically the spectrum and eigenstate properties of the Google matrix of various
examples of directed networks such as vocabulary networks of dictionaries and university World
Wide Web networks. The spectra have gapless structure in the vicinity of the maximal eigenvalue for
Google damping parameter $\alpha$ equal to unity. The vocabulary networks have relatively homogeneous
spectral density, while university networks have pronounced spectral structures which change from
one university to another, reflecting specific properties of the networks. We also determine specific
properties of eigenstates of the Google matrix, including the PageRank. The fidelity of the PageRank
is proposed as a new characterization of its stability.



PACS numbers: 89.20.Hh, 89.75.Hc, 05.40.Fb, 72.15.Rn






I INTRODUCTION 

The rapid growth of the World Wide Web (WWW) brings the challenge of information retrieval from
this enormous database, which at present contains about $10^{11}$ webpages. An efficient algorithm
for the classification of webpages was proposed in [1], and is now known as the PageRank Algorithm
(PRA). The PRA formed the basis of the Google search engine, which is used by the majority of
Internet users in everyday life. It efficiently determines a vector that ranks the nodes of a
network by order of importance. This PageRank vector is obtained as an eigenvector of the Google
matrix $G$ built on the basis of the directed links between WWW nodes (see e.g. [?]):

$$G = \alpha S + (1-\alpha)\,E/N \qquad (1)$$

Here $S$ is the matrix constructed from the adjacency matrix $A_{ij}$ of the directed links of the
network of size $N$, with $A_{ij} = 1$ if there is a link from node $j$ to node $i$, and
$A_{ij} = 0$ otherwise. Namely, $S_{ij} = A_{ij}/\sum_k A_{kj}$ if $\sum_k A_{kj} > 0$, and
$S_{ij} = 1/N$ if all elements in column $j$ of $A$ are zero. The last term in Eq. (1), with the
uniform matrix $E_{ij} = 1$, describes the probability $1-\alpha$ for a random surfer propagating
along the network to jump randomly to any other node. The matrix $G$ belongs to the class of
Perron-Frobenius operators. For $0 < \alpha < 1$ it has a unique maximal eigenvalue at
$\lambda = 1$, separated from the others by a gap of size at least $1-\alpha$ (see e.g. [?]).
The eigenvector associated with this maximal eigenvalue is the PageRank vector, which can be viewed
as the steady-state distribution for the random surfer. Usual WWW networks correspond to a very
sparse matrix $A$, and repeated applications of $G$ to a random vector converge quickly to the
PageRank vector, after 50-100 iterations for $\alpha = 0.85$, which is the most commonly used
value [?]. The PageRank vector is real and nonnegative, and its components can be ordered by
decreasing value $p_j$, giving the relative importance of node $j$. It is known that when $\alpha$
varies, all eigenvalues evolve as $\alpha\lambda_i$, where $\lambda_i$ are the eigenvalues for
$\alpha = 1$ and $i = 2, \ldots, N$, while the largest eigenvalue $\lambda_1 = 1$, associated with
the PageRank, remains unchanged [?].
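The construction of Eq. (1) and the vector iteration described above can be sketched in a few lines. This is only a minimal illustration using dense NumPy arrays; the function names `google_matrix` and `pagerank`, the tolerance, and the 4-node toy network are assumptions made for the example and are not taken from the networks studied in this paper.

```python
import numpy as np


def google_matrix(A, alpha=0.85):
    """Return G = alpha*S + (1 - alpha)*E/N for adjacency A, where A[i, j] = 1 means a link j -> i."""
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    col_sums = A.sum(axis=0)
    S = np.full((N, N), 1.0 / N)          # columns of dangling nodes stay at 1/N
    linked = col_sums > 0
    S[:, linked] = A[:, linked] / col_sums[linked]
    return alpha * S + (1.0 - alpha) / N  # the scalar adds (1 - alpha)/N to every entry


def pagerank(G, tol=1e-12, max_iter=1000):
    """Iterate p -> G p from the uniform vector until the L1 change drops below tol."""
    N = G.shape[0]
    p = np.full(N, 1.0 / N)
    for _ in range(max_iter):
        p_next = G @ p
        p_next /= p_next.sum()
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Hypothetical toy network (not from the paper): 0->1, 0->2, 1->2, 2->0, 2->3; node 3 is dangling.
A = np.zeros((4, 4))
for j, i in [(0, 1), (0, 2), (1, 2), (2, 0), (2, 3)]:
    A[i, j] = 1.0

G = google_matrix(A, alpha=0.85)
p = pagerank(G)
ranking = np.argsort(-p)   # node indices ordered by decreasing PageRank
```

For the large sparse networks discussed in the text one would store $S$ in a sparse format and apply the uniform term analytically instead of forming the dense matrix $G$.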

The properties of the PageRank vector for the WWW have been extensively studied by the computer
science community, and many important properties have been established [?]. For example, it was
shown that $p_j$ decreases approximately algebraically, $p_j \sim 1/j^{\beta}$, with the exponent
$\beta \approx 0.9$. It is also known that typically for the Google matrix of the WWW at
$\alpha = 1$ there are many eigenvalues very close or equal to $\lambda = 1$, and that even at
finite $\alpha < 1$ there are degeneracies of eigenvalues with $\lambda = \alpha$ (see e.g. [?]).

In spite of the important progress obtained during these investigations of PageRank vectors, the
spectrum of the Google matrix $G$ was rarely studied as a whole. Nevertheless, it is clear that the
structure of the network is directly linked to this spectrum. Eigenvectors other than the PageRank
describe the relaxation processes toward the steady state, and also characterize various
communities or subsets of the network. Even though models of directed networks of small-world type
[?] have been constructed and analyzed, the spectral properties of the matrices corresponding to
such networks have received little attention. Generally, for a directed network the matrix $G$ is
nonsymmetric, and thus its eigenvalues are complex. Recently, a spectral study of the Google matrix
for the Albert-Barabasi (AB) model [10] and for randomized university WWW networks was performed in
[11]. For the AB model the distribution of links is typical of scale-free networks [?]. The
distribution of links for the university network is approximately the same and is not affected by
the randomization procedure. Indeed, the randomization procedure corresponds to the one proposed in
[12] and is performed by taking pairs of links and exchanging their initial vertices, which keeps
the number of ingoing and outgoing links of each vertex unchanged. It was established that the
spectra of the AB model and of the randomized university networks are quite similar. Both have a
large gap between the largest eigenvalue $\lambda_1 = 1$ and the next one, with
$|\lambda_2| \approx 0.5$ at $\alpha = 1$. This is in contrast with the known property of the WWW,
where $\lambda_2$ is usually very close or equal to unity [?]. Thus it appears that the AB model
and the randomized scale-free networks have a very different spectral structure compared to real
WWW networks. Therefore it is important to study the spectral properties of examples of real
networks (without randomization).
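The randomization procedure mentioned above (exchanging the initial vertices of two links so that all in- and out-degrees are preserved) can be illustrated as follows. This is a sketch under stated assumptions, not the exact procedure of [12]: the function name `randomize_links` is hypothetical, and the choice to reject swaps that would create a self-loop or a duplicate link is an assumption made here.

```python
import random
from collections import Counter


def randomize_links(edges, n_swaps, seed=0):
    """Swap the source vertices of randomly chosen link pairs; degrees of every vertex are kept."""
    rng = random.Random(seed)
    edges = list(edges)              # each link is a (source, target) pair
    present = set(edges)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (c, b), (a, d)  # exchange the initial vertices a and c
        # Assumption: reject swaps producing self-loops or links that already exist.
        if c == b or a == d or new1 in present or new2 in present:
            continue
        present.difference_update([edges[i], edges[j]])
        present.update([new1, new2])
        edges[i], edges[j] = new1, new2
        done += 1
    return edges


# Hypothetical example network with 4 vertices and 6 directed links.
original = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3)]
shuffled = randomize_links(original, n_swaps=20)

out_before = Counter(s for s, _ in original)
in_before = Counter(t for _, t in original)
out_after = Counter(s for s, _ in shuffled)
in_after = Counter(t for _, t in shuffled)
```

Because each accepted swap only exchanges the sources of two links, every vertex keeps its number of outgoing and ingoing links, which is the property stated in the text.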

In this paper, we thus study the spectra of Google matrices for the WWW networks of several
universities and show that they indeed display very different properties compared to the random
scale-free networks considered in [11]. We also explore the spectra of a completely different type
of real network, built from the vocabulary links in various dictionaries. In addition, we analyze
the properties of eigenvectors of the Google matrix for these networks. Special attention is paid
to the PageRank vector; in particular, we characterize its sensitivity to $\alpha$ through a new
quantity, the PageRank fidelity.

The paper is organized as follows. In Section II we give 
the description of the university and vocabulary networks 
whose Google matrices we consider. The properties of 
spectra and eigenstates are investigated in Section III.
The fidelity of PageRank and its other properties are 
analyzed in Section IV. Section V explores various models 
of random networks for which the spectrum can be closer 
to the one of real networks. The conclusion is given in 
Section VI. 



II DESCRIPTION OF NETWORKS OF 
UNIVERSITY WWW AND DICTIONARIES 

In order to study the spectra and eigenvectors of 
Google matrices of real networks, we numerically ex- 
plored several systems. 

Our first example consists of the WWW networks of UK universities, taken from the database [?].
The vertices are the HTML pages of the university websites in 2002. The links correspond to
hyperlinks in the pages directing to another webpage. To reduce the size of the matrices in order
to perform exact diagonalization, only webpages with at least one outlink were considered. Despite
this selection, there are still dangling nodes, since some sites have outlinks only to sites with
no outlink. We checked on several examples that the general properties of the spectra were not
affected by this reduction in size. We present data on the spectra from five universities:

• University of Wales at Cardiff (www.uwic.ac.uk);
2778 sites and 29281 links.

• Birmingham City University (www.uce.ac.uk);
10631 sites and 82574 links.

• Keele University (Staffordshire) (www.keele.ac.uk);
11437 sites and 67761 links.

• Nottingham Trent University (www.ntu.ac.uk);
12660 sites and 85826 links.

• Liverpool John Moores University (www.livjm.ac.uk);
13578 sites and 111648 links.



A much larger sample of university networks from the same database was actually used, including
universities from the US, Australia and New Zealand, in order to ensure that the results presented
are representative.

Unlike the full spectrum of the Google matrix, the PageRank can be computed and studied for much
larger matrix sizes. In the studies of Section IV, we therefore included additional data from the
university network of Oxford in 2006 (www.oxford.ac.uk), with 173733 sites and 2917014 links, taken
from [13], and the network of Notre Dame University in the US, taken from the database [3], with
325729 sites and 1497135 links (without removing any node).

In addition, we also investigated several vocabulary 
networks constructed from dictionaries; the network data 
were taken from [?].

• Roget dictionary (1022 vertices and 5075 links) [?].
The 1022 vertices correspond to the categories in 
the 1879 edition of Roget's Thesaurus of English 
Words and Phrases. There is a link from category X 
to category Y if Roget gave a reference to Y among 
the words and phrases of X, or if the two categories 
are related by their positions in the book. 

• ODLIS dictionary (Online Dictionary of Library 
and Information Science), version December 2000 
(2909 vertices and 18419 links). 

A link (X,Y) from term X to term Y is created if 
the term Y is used in the definition of term X. 

• FOLDOC dictionary (Free On-Line Dictionary of 
Computing) [?] (13356 vertices and 120238 links).
A link (X,Y) from term X to term Y is created if 
the term Y is used in the definition of term X. 

The distributions of ingoing and outgoing links for these university WWW networks are similar to
those of the much larger WWW networks discussed in [?]. An example is shown in the Appendix for the
network of Liverpool John Moores University, together with data from the AB models discussed in
[11] (see Fig. 12).



III PROPERTIES OF SPECTRUM AND
EIGENSTATES 

To study the spectrum of the networks described in the previous section, we construct the Google
matrix $G$ associated with them at $\alpha = 1$. The spectrum $\lambda_i$ and right eigenstates
$\psi_i$ of $G$ (satisfying the relation $G\psi_i = \lambda_i\psi_i$) are then computed by direct
diagonalization using standard LAPACK routines. Since $G$ is generally a nonsymmetric matrix for
our networks, the eigenvalues $\lambda_i$ are distributed in the complex plane, inside the unit
disk $|\lambda_i| \le 1$.
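As an illustration of this step, the following sketch diagonalizes the matrix at $\alpha = 1$ for a small random directed network with `numpy.linalg.eigvals`, which relies on LAPACK. The random network, its size and link density, and the helper name `stochastic_matrix` are assumptions chosen for the example; they do not correspond to any of the networks studied here.

```python
import numpy as np


def stochastic_matrix(A):
    """Column-normalize adjacency A (A[i, j] = 1 for a link j -> i); dangling columns become 1/N."""
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    col_sums = A.sum(axis=0)
    S = np.full((N, N), 1.0 / N)
    linked = col_sums > 0
    S[:, linked] = A[:, linked] / col_sums[linked]
    return S


# Hypothetical random directed network with N = 200 nodes and about 2% link density.
rng = np.random.default_rng(1)
N = 200
A = (rng.random((N, N)) < 0.02).astype(float)
np.fill_diagonal(A, 0.0)

S = stochastic_matrix(A)                  # Google matrix G at alpha = 1
lam = np.linalg.eigvals(S)                # complex eigenvalues of the nonsymmetric matrix
radius = np.abs(lam).max()                # should not exceed 1 up to rounding
n_unit = int(np.sum(np.abs(lam - 1.0) < 1e-8))   # numerical degeneracy of lambda = 1
```

The count `n_unit` uses a fixed numerical tolerance, which is a choice of this sketch; for real networks with many quasidegenerate eigenvalues near 1 the tolerance would need more care.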

The spectrum for our eight networks is shown in Fig. 1. An important property of these spectra is
the presence of eigenvalues very close to $\lambda = 1$; moreover, we find that the eigenvalue
$\lambda = 1$ has a significant degeneracy. It is known that such an exact degeneracy is typical
for WWW networks (see e.g. [?]). In addition to this exact degeneracy, there are quasidegenerate
eigenvalues very close to $\lambda = 1$. It is important to note that these features are absent in
the spectra of the random networks studied in [11], based on the AB model and on the randomization
of WWW university networks, where the spectrum is characterized by a large gap between the first
eigenvalue $\lambda_1 = 1$ and the second one with $|\lambda_2| \approx 0.5$. For example, the
spectrum shown in Fig. 1, panel H, corresponds to the same university whose randomized spectrum was
displayed in Fig. 1 (bottom panel) of [11]. Clearly the structure of the spectrum becomes very
different after randomization of the links. Another property of the spectra displayed in Fig. 1
that we want to stress is the presence of clearly pronounced structures which differ from one
network to another. The structure is less pronounced in the case of the three spectra obtained from
dictionary networks. In this case, the spectrum is flattened, being closer to the real axis. In
contrast, for the WWW university networks, the spectrum is spread out over the unit disk. However,
there is still a significant fraction of eigenvalues close to the real axis. We attribute this
feature to the existence of a significant number of symmetric ingoing and outgoing links (48% in
the case of the Liverpool John Moores University network), which is larger than in the randomized
university networks considered in [11].

FIG. 1: Distribution of eigenvalues $\lambda_i$ of Google matrices in the complex plane at
$\alpha = 1$ for dictionary networks: Roget (A, N=1022), ODLIS (B, N=2909) and FOLDOC (C, N=13356);
university WWW networks: University of Wales (Cardiff) (D, N=2778), Birmingham City University
(E, N=10631), Keele University (Staffordshire) (F, N=11437), Nottingham Trent University
(G, N=12660), Liverpool John Moores University (H, N=13578) (data for universities are for 2002).

To characterize the spectrum, we introduce the relaxation rate $\gamma$ defined by the relation
$|\lambda| = \exp(-\gamma/2)$. To characterize the eigenvectors $\psi_i(j)$, we use the
PArticipation Ratio (PAR) defined by
$\xi = (\sum_j |\psi_i(j)|^2)^2 / \sum_j |\psi_i(j)|^4$. This quantity gives the effective number
of vertices of the network contributing to a given eigenstate $\psi_i$; it is often used in
solid-state systems with disorder to characterize localization properties (see e.g. [?]). The
dependence on $\gamma$ of the density of states $W(\gamma)$, which gives the number of eigenstates
in the interval $[\gamma, \gamma + d\gamma]$, is shown in Figs. 2, 3, 4, 5 (top panels). The
normalization is chosen such that $\int_0^\infty W(\gamma)\,d\gamma = 1$, corresponding to the
total number of eigenvalues $N$ (equal to the matrix size). We also show the integrated version of
this quantity in the same panels. In the same figures we show the PAR $\xi$ of the eigenstates as a
function of $\gamma$ (bottom panels).

FIG. 2: (Color online) Left: Roget dictionary, $\alpha = 1$. Top panel: normalized density of
states $W$ (black), obtained as the derivative of a smoothed version of the integrated density
(smoothed over a small interval $\Delta\gamma$ varying with matrix size); the integrated density is
shown in red (grey). Bottom panel: PAR of eigenvectors as a function of $\gamma$; degeneracy of
$\lambda = 1$ is 18 (note that the value $W(0)$ corresponds to eigenvalues with $|\lambda| = 1$).
Right: ODLIS dictionary, same as left; degeneracy of $\lambda = 1$ is 4.

FIG. 3: (Color online) Left: FOLDOC dictionary, same as Fig. 2; degeneracy of $\lambda = 1$ is 1.
Right: University of Wales (Cardiff), same as left; degeneracy of $\lambda = 1$ is 69.

It is clear that for the dictionary networks the density of states $W$ depends on $\gamma$ in a
relatively smooth way, with a broad maximum at $\gamma \sim 1$-$2$. The distribution of PAR also
has a maximum at approximately the same values. The case of the FOLDOC dictionary is somewhat
special, showing a bimodal distribution which is also clearly seen in the dependence of $\xi$ on
$\gamma$. This comes from the fact that the distribution of eigenvalues in Fig. 1 (panel C) is
highly asymmetric with respect to the imaginary axis. This network also has no degeneracy at
$\lambda = 1$. In these three networks the density of states decreases for $\gamma$ approaching 0.
We note that the integrated version of the density of states reaches a plateau for
$\gamma > 6$-$7$. This saturation value is less than 1, meaning that a certain nonzero fraction of
eigenvalues are extremely close to $\lambda = 0$.

For the WWW university networks, the density of states is much more inhomogeneous in $\gamma$. Even
if a broad maximum is visible, there are sharp peaks at certain values of $\gamma$. The sharpest
peaks correspond to exact degeneracies at certain complex values of $\lambda$. The degeneracies are
especially visible at the real values $\lambda = 1/2$, $\lambda = 1/3$ and other $1/n$ with integer
$n$. We attribute this phenomenon to the fact that the small number of links gives only a small
number of different values for the matrix elements of $G$. For the university networks, the
degeneracy at $\lambda = 1$ is much larger than in the case of dictionaries. The integrated
densities of states show visible vertical jumps which correspond to the degeneracies; their growth
saturates at $\gamma \approx 7$, showing that about 30-50% of the eigenvalues are located in the
vicinity of $\lambda = 0$.

FIG. 4: (Color online) Left: Birmingham City University, same as Fig. 2; degeneracy of
$\lambda = 1$ is 71. Right: Keele University (Staffordshire), same as left; degeneracy of
$\lambda = 1$ is 205.

FIG. 5: (Color online) Left: Nottingham Trent University, same as Fig. 2; degeneracy of
$\lambda = 1$ is 229. Right: Liverpool John Moores University, same as left; degeneracy of
$\lambda = 1$ is 109; other degeneracy peaks correspond to $\lambda = 1/2$ (16), $\lambda = 1/3$
(8), $\lambda = 1/4$ (947), $\lambda = 1/5$ (97), being located at $\gamma = -2\ln\lambda$; other
degeneracies are also present, e.g. $\lambda = 1/\sqrt{2}$ (41).

The PAR distribution for the university networks fluctuates strongly, even if a broad maximum is
visible. Typical values have $\xi \sim 100$, which is small compared to the matrix size
$N \sim 10^4$. This indicates that the majority of eigenstates are localized on certain zones of
the network.

FIG. 6: Cloud of eigenvalues for Liverpool John Moores University, $\alpha = 1$. Circles: full
matrix, $N = 13578$. Stars: truncated matrix of size 8192 (left) and 4096 (right).
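The relaxation rate $\gamma$ and the participation ratio $\xi$ used in this section can be evaluated directly from the output of a diagonalization routine. The sketch below is illustrative only: the function names, the floor used to avoid $\ln 0$ for eigenvalues at the origin, and the 3-site test matrix are assumptions of this example.

```python
import numpy as np


def relaxation_rate(lam, floor=1e-300):
    """gamma = -2 ln|lambda|; |lambda| is floored to avoid log(0) for eigenvalues at the origin."""
    return -2.0 * np.log(np.maximum(np.abs(lam), floor))


def participation_ratio(psi):
    """PAR of each column of psi: (sum_j |psi_j|^2)^2 / sum_j |psi_j|^4."""
    w = np.abs(psi) ** 2
    return w.sum(axis=0) ** 2 / (w ** 2).sum(axis=0)


# Hypothetical test matrix: a directed 3-cycle (column-stochastic permutation matrix).
S = np.array([[0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
lam, psi = np.linalg.eig(S)    # columns of psi are the right eigenvectors
gamma = relaxation_rate(lam)   # all eigenvalues lie on the unit circle here
xi = participation_ratio(psi)  # each eigenvector spreads uniformly over the 3 sites
```

A vector concentrated on a single site gives $\xi = 1$ and a uniform vector over $N$ sites gives $\xi = N$, which is why $\xi$ is read as an effective number of populated vertices.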

The exact diagonalization of the matrix $G$ requires significant computer memory and is practically
restricted to matrix sizes $N < 30000$. However, real networks such as WWW networks can be much
larger. It is therefore important to find numerical approaches that give the spectrum of large
networks by approximate methods. A natural possibility is to order the sites through the PageRank
method and to consider the spectrum of the (properly renormalized) truncated matrix restricted to
the sites with PageRank larger than a certain value. In this way, the truncation takes into account
the most important sites of the network. The effect of such a truncation is shown in Fig. 6 for the
largest network of our sample. The numerical data show that the global features of the spectrum are
preserved by moderate truncation, but individual eigenvalues deviate from their exact values when
more than 50% of the sites are truncated. Further development of this approach is probably needed
in order to be able to truncate a larger fraction of sites.
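One possible reading of this truncation can be sketched as follows. The text does not specify how the truncated matrix is renormalized, so the column renormalization used here (each remaining column rescaled to sum to 1, empty columns replaced by a uniform column) is an assumption, as are the function names and the random test network.

```python
import numpy as np


def column_stochastic(A):
    """Column-normalize adjacency A (A[i, j] = 1 for a link j -> i); dangling columns become 1/N."""
    N = A.shape[0]
    col_sums = A.sum(axis=0)
    S = np.full((N, N), 1.0 / N)
    linked = col_sums > 0
    S[:, linked] = A[:, linked] / col_sums[linked]
    return S


def pagerank(S, alpha=0.85, n_iter=200):
    """Vector iteration for G = alpha*S + (1 - alpha)*E/N without building G explicitly."""
    N = S.shape[0]
    p = np.full(N, 1.0 / N)
    for _ in range(n_iter):
        p = alpha * (S @ p) + (1.0 - alpha) / N   # valid because sum(p) stays equal to 1
    return p


def truncate_by_pagerank(S, p, n_keep):
    """Keep the n_keep sites of largest PageRank and renormalize the columns (assumed scheme)."""
    keep = np.argsort(-p)[:n_keep]
    T = S[np.ix_(keep, keep)].copy()
    col_sums = T.sum(axis=0)
    linked = col_sums > 0
    T[:, linked] /= col_sums[linked]
    T[:, ~linked] = 1.0 / n_keep
    return T, keep


# Hypothetical random network with N = 300 nodes; half of the sites are kept.
rng = np.random.default_rng(2)
N = 300
A = (rng.random((N, N)) < 0.02).astype(float)
np.fill_diagonal(A, 0.0)
S = column_stochastic(A)
p = pagerank(S)
T, keep = truncate_by_pagerank(S, p, n_keep=150)
lam_truncated = np.linalg.eigvals(T)
```

Comparing `lam_truncated` with the eigenvalues of the full matrix `S` gives a small-scale analogue of the comparison shown in Fig. 6.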



IV FIDELITY OF PAGERANK AND ITS OTHER 
PROPERTIES 



In the previous section we studied the properties of the full spectrum and all eigenstates of the
matrix $G$ for several real networks. The PageRank is especially important since it provides an
efficient classification of the sites of the network [1, ?]. Since the networks usually have a
small number of links, it is possible to obtain the PageRank by vector iteration for networks of
enormous size, as described in [?].



FIG. 7: (Color online) PAR $\xi$ of PageRank as a function of $\alpha$ for University of Wales
(Cardiff) (black/dashed), Notre Dame (blue/dotted), Liverpool John Moores University (red/long
dashed) and Oxford (green/solid) Universities (curves from top to bottom at $\alpha = 0.6$).
Network sizes vary from $N = 2778$ to $N = 325729$. Inset is a zoom for data from Oxford University
close to $\alpha = 1$.



Given the significance of the PageRank, it is important to characterize its properties. In
addition, it is important to know how sensitive the PageRank is with respect to the Google
(damping) parameter $\alpha$ in Eq. (1). The localization property of the PageRank can be
quantified through the PAR $\xi$ defined above. The dependence of $\xi$ on $\alpha$ is shown in
Fig. 7 for four university WWW networks, including two from Fig. 1 (panels D and H) and two of much
larger size (Notre Dame and Oxford). For $\alpha \to 0$ the PAR goes to the matrix size, since the
matrix $G$ is dominated by the second term of Eq. (1). However, in the interval
$0.4 < \alpha < 0.9$ the dependence on $\alpha$ is rather weak, indicating stability of the
PageRank. For $0.9 < \alpha < 1$ the PAR has a local maximum where its value can increase by a
factor of 2-3. We attribute this effect to the existence of an exact degeneracy of the eigenvalue
$\lambda = 1$ at $\alpha = 1$, discussed in the previous section. In spite of this interesting
behavior of $\xi$ in the vicinity of $\alpha = 1$, the value of $\xi$, which gives the effective
number of populated sites, remains much smaller than the network size. In other models considered
in [11, 20], a delocalization of the PageRank was observed for some values of $\alpha$, so that
$\xi$ was growing with the system size $N$. For the WWW university networks considered here,
delocalization is clearly absent (network sizes in Fig. 7 vary over two orders of magnitude). This
is in agreement with the value of the exponent $\beta \approx 0.9$ for the PageRank decay, which
was found for large samples of WWW data in [?]. Indeed, for that value of $\beta$, the PAR should
be independent of system size.
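A short estimate indicates why a size-independent PAR is expected. It assumes an idealized pure power law $p_j \propto j^{-\beta}$ for all $1 \le j \le N$, which is an assumption made only for this illustration, since real PageRank vectors follow it only approximately. Because the PAR does not depend on the normalization of the vector,

$$\xi = \frac{\big(\sum_{j=1}^{N} p_j^2\big)^2}{\sum_{j=1}^{N} p_j^4}
      = \frac{\big(\sum_{j=1}^{N} j^{-2\beta}\big)^2}{\sum_{j=1}^{N} j^{-4\beta}}
      \;\to\; \frac{\zeta(2\beta)^2}{\zeta(4\beta)} \quad (N \to \infty,\ \beta > 1/2),$$

where $\zeta$ is the Riemann zeta function. Both sums converge for $\beta > 1/2$, so for $\beta \approx 0.9$ the PAR tends to a constant that does not depend on $N$. For $\beta < 1/2$ the numerator diverges faster than the denominator and $\xi$ grows with the system size.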




FIG. 8: (Color online) Some PageRank vectors $p_j$ for Notre Dame University (left panel) and
Oxford (right panel). From top to bottom at $\log_{10}(i) = 5$: $\alpha = 0.49$ (black), 0.59
(red), 0.69 (green), 0.79 (blue), 0.89 (violet) and 0.99 (orange). The dashed line indicates the
slope $-1$.

Our data for the PageRank distribution also show its stability as a whole under variation of
$\alpha$ in the interval $0.4 < \alpha < 0.9$, as shown in Fig. 8.

FIG. 9: (Color online) PageRank fidelity $f(\alpha, \alpha')$ for Notre Dame University
($N = 325729$). Left panel: $f(\alpha, \alpha' = 0.85) = |\langle\psi(\alpha)|\psi(0.85)\rangle|^2$
(see Eq. (2)). Right panel: color density plot of $f(\alpha, \alpha')$.

The sensitivity of the PageRank with respect to $\alpha$ can be more precisely characterized
through the PageRank fidelity defined as

$$f(\alpha, \alpha') = \Big|\sum_j \psi_1(j, \alpha)\,\psi_1(j, \alpha')\Big|^2 \qquad (2)$$

where $\psi_1(j, \alpha)$ is the eigenstate at $\lambda = 1$ of the Google matrix $G$ with
parameter $\alpha$ in Eq. (1); here the sum over $j$ runs over the network sites (without PageRank
reordering). We recall that the eigenvector $\psi_1(j, \alpha)$ is normalized by
$\sum_j \psi_1(j, \alpha)^2 = 1$. Fidelity is often used in the context of quantum chaos and
quantum computing to characterize the sensitivity of wavefunctions with respect to a perturbation
[21, 22]. The variation of this quantity with $\alpha$ and $\alpha'$ is shown in Fig. 9. The
fidelity reaches its maximum value $f = 1$ for $\alpha = \alpha'$. According to Fig. 9 (right
panel), the stability plateau where the fidelity remains close to 1, indicating stability of the
PageRank, is broadest for $\alpha = 0.5$. This is in agreement with previous results presented in
[23], where the same optimal value of $\alpha$ was found based on different arguments.



V SPECTRUM OF MODEL SYSTEMS

The results obtained in [?], compared to those presented in the previous section, show that while
the spectrum of the model networks has a large gap between $\lambda = 1$ and the other eigenvalues,
certain properties of the PageRank can still be similar in both cases (e.g. the exponent $\beta$).
In fact, the studies performed in the computer science community were often based on simplified
models, which can nevertheless give a value of $\beta$ close to that of real networks. For example,
the model studied by Avrachenkov and Lebedev [?] yields analytical expressions for $\beta$ with a
value close to the one obtained for the WWW. It is interesting to see what the spectral properties
of this model are. In Fig. 10 we show the spectrum for this model for $\alpha = 0.85$. Our data
show that this model has an enormous gap, thus being very different from the spectra of real
networks shown in Fig. 1.

FIG. 10: Spectrum of eigenvalues $\lambda$ in the complex plane for the Avrachenkov-Lebedev model
of [?], with $N = 2^{11}$ (network size), $\alpha = 0.85$, $m = 5$ outgoing links per node.
Multiplicity of links is taken into account in the construction of $G$.



The above results, together with those of [?], show that many commonly used network models are
characterized by a large gap between $\lambda = 1$ and the second eigenvalue, in contrast with real
networks. In order to build a network model where this gap is absent, we introduce here what we
call the color model. It is an extension of the AB model that gives results for the spectral
distribution that are closer to those of real networks. We divide the nodes into $n$ sets
("colors"), allowing $n$ to grow with network size. Each node is labeled by an integer between $0$
and $n-1$. At each step, links and nodes are added as in the AB model, but in addition, with
probability $\eta$, the new node is introduced with a new color. The only links authorized between
nodes are links within each set. Such a structure implies that the second eigenvalue of the matrix
$G$ is real and exactly equal to $\alpha$ [24]. The colors correspond to communities in the
network.
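The statement about the second eigenvalue can be checked on a very small example. The sketch below does not implement the color model itself (the growth rules of the AB model are not reproduced); it only takes two hypothetical disconnected color sets, builds $G$ from Eq. (1), and looks at the two eigenvalues of largest modulus. The network and the function name are assumptions of this illustration.

```python
import numpy as np


def google_matrix(A, alpha):
    """G = alpha*S + (1 - alpha)*E/N with A[i, j] = 1 for a link j -> i."""
    N = A.shape[0]
    col_sums = A.sum(axis=0)
    S = np.full((N, N), 1.0 / N)
    linked = col_sums > 0
    S[:, linked] = A[:, linked] / col_sums[linked]
    return alpha * S + (1.0 - alpha) / N


# Two hypothetical color sets, nodes 0-2 and nodes 3-6, with links only inside each set.
links = [(0, 1), (1, 2), (2, 0), (0, 2),          # first color
         (3, 4), (4, 5), (5, 6), (6, 3), (3, 5)]  # second color
A = np.zeros((7, 7))
for j, i in links:
    A[i, j] = 1.0

alpha = 0.85
lam = np.linalg.eigvals(google_matrix(A, alpha))
lam = lam[np.argsort(-np.abs(lam))]   # sort by decreasing modulus
second = lam[1]                       # expected to coincide with alpha
```

At $\alpha = 1$ each isolated set contributes its own eigenvalue $\lambda = 1$; since all eigenvalues other than the PageRank one are multiplied by $\alpha$, one of them appears at $\lambda = \alpha$ in $G$.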

In order to have a more realistic model, we allow the rule for links to be broken with some
probability $\varepsilon$. That is, at each time step a link between two nodes is chosen at random
according to the rules of the AB model. If it agrees with the color rule above, it is used; if it
does not, then with probability $1-\varepsilon$ it is simply omitted, and with probability
$\varepsilon$ it is nevertheless added.

The spectrum of this color model is shown in Fig. 11 for $\alpha = 0.85$. The second eigenvalue is
now exactly at $\lambda = 0.85$, demonstrating the absence of a gap. There is also a set of
eigenvalues located on the real line, but the majority of states remain inside a circle
$|\lambda| < 0.3$, as in the AB model.

FIG. 11: Spectrum of eigenvalues $\lambda$ in the complex plane for the color model,
$N = 2^{13}$, $p = 0.2$, $q = 0.1$, $\alpha = 0.85$. Nodes are divided into $n$ color sets labeled
from $i = 0$ to $n-1$; nodes and links are created according to the AB model; the only authorized
links are links within a color set $i$. This rule is broken with probability
$\varepsilon = 10^{-3}$. We start with 3 color sets; with probability $\eta$ a new color is
introduced (we take $\eta = 10^{-2}$). In the example displayed, when the number of nodes reaches
$N$, $n = 83$ colors.



Thus the color model eliminates the gap, but the distribution of eigenvalues $\lambda$ in the
complex plane still remains different from the spectra of real networks shown in Fig. 1: the
structures prominent in real networks are not visible, and eigenvalues in the gap are concentrated
only on the real axis or close to it.

VI CONCLUSION 

In this work we performed a numerical analysis of the spectra and eigenstates of the Google matrix
$G$ for several real networks. The spectra of the analyzed networks have no gap between the first
and second eigenvalues, in contrast with commonly used scale-free network models (e.g. the AB
model). The spectra of university WWW networks are characterized by complex structures which differ
from one university to another. At the same time, the PageRanks of these university networks look
rather similar. In contrast, the Google matrices of vocabulary networks of dictionaries have
spectra with much less structure.

These studies show that usual models of random scale-free networks miss many important features of
real networks. In particular, they are characterized by a large spectral gap, which is generally
absent in real networks. We attribute the physical origin of this gap to the known property of
small-world and scale-free networks that only logarithmic time (in system size) is needed to go
from any node to any other node (see e.g. [?]). As a result, the relaxation process in such
networks is fast and the gap, being inversely proportional to this time, is accordingly very large.
In contrast, the presence of weakly coupled communities in real networks makes the relaxation time
very large, at least for certain configurations. Therefore, it is desirable to construct new random
scale-free models which capture the actual properties of real networks more accurately. The color
model presented here is a first step in this direction. We note that Ulam networks built from
dynamical maps can capture certain properties of real networks relatively well [20]; in these
latter networks, it is possible to have a delocalization of the PageRank when $\alpha$ or map
parameters vary; we did not observe such a feature here.

Indeed, our data show that the PageRank remains localized for all values of $\alpha > 0.3$. We also
showed that using the fidelity as a new quantity to characterize the stability of the PageRank
makes it possible to identify a stability plateau located around $\alpha = 0.5$.
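For readers who wish to reproduce this type of analysis, the fidelity of Eq. (2) can be evaluated as in the following sketch. It is a minimal illustration, not the code used for the figures: the random test network, the helper names, and the number of iterations are assumptions of this example.

```python
import numpy as np


def column_stochastic(A):
    """Column-normalize adjacency A (A[i, j] = 1 for a link j -> i); dangling columns become 1/N."""
    N = A.shape[0]
    col_sums = A.sum(axis=0)
    S = np.full((N, N), 1.0 / N)
    linked = col_sums > 0
    S[:, linked] = A[:, linked] / col_sums[linked]
    return S


def pagerank_state(S, alpha, n_iter=500):
    """PageRank by vector iteration, returned with the normalization sum_j psi_j^2 = 1 of Eq. (2)."""
    N = S.shape[0]
    p = np.full(N, 1.0 / N)
    for _ in range(n_iter):
        p = alpha * (S @ p) + (1.0 - alpha) / N   # valid because sum(p) stays equal to 1
    return p / np.linalg.norm(p)


def fidelity(psi_a, psi_b):
    """f = |sum_j psi_a(j) psi_b(j)|^2 for real, L2-normalized vectors in the same site ordering."""
    return float(np.dot(psi_a, psi_b) ** 2)


# Hypothetical random network with N = 300 nodes.
rng = np.random.default_rng(3)
N = 300
A = (rng.random((N, N)) < 0.02).astype(float)
np.fill_diagonal(A, 0.0)
S = column_stochastic(A)

psi_050 = pagerank_state(S, 0.50)
psi_085 = pagerank_state(S, 0.85)
f_same = fidelity(psi_085, psi_085)   # equals 1 up to rounding
f_diff = fidelity(psi_050, psi_085)   # below 1 when the two PageRank vectors differ
```

Scanning both arguments over a grid of damping values gives a small-scale analogue of the density plot in Fig. 9.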

We think that future investigation of the spectral properties of the Google matrix will open new
ways to identify important communities and their properties, which can be hidden in the tail of the
PageRank and hardly accessible to classification by the PageRank algorithm. Furthermore, the
degeneracies at various values of $\lambda$ and the characteristic patterns directly visible in the
spectra of the Google matrix should make it possible to identify other hidden properties of real
networks.

We thank Leonardo Ermann and Klaus Frahm for dis- 
cussions, and CALcul en Midi-Pyrenees (CALMIP) for 
the use of their supercomputers. 
APPENDIX

Here we show the distributions of links for the AB model discussed in [11] and for the university
WWW network (panel H of Fig. 1).




FIG. 12: Cumulative distribution of ingoing links $P^{\mathrm{in}}(k)$ (top panel) and of outgoing
links $P^{\mathrm{out}}(k)$ (bottom panel) for the AB model with vector size $N = 2^{14}$, for
$q = 0.1$ (black/solid) and $q = 0.7$ (red/dashed); data are averaged over 80 realizations of the
AB model, and for the network of Liverpool John Moores University with $N = 13578$ (panel H in
Fig. 1) (blue/dotted). The average number of in- or outgoing links is $\langle k \rangle = 6.43$
for $q = 0.1$, $\langle k \rangle = 14.98$ for $q = 0.7$, and $\langle k \rangle = 8.2227$ for
LJMU. The dashed straight line indicates the slope $-1$. Logarithms are decimal.



